DeepMem: ML Models as storage channels and their (mis-)applications
Machine learning (ML) models are overparameterized to support generality and
avoid overfitting. Prior works have shown that these additional parameters can
be used for both malicious (e.g., hiding a model covertly within a trained
model) and beneficial purposes (e.g., watermarking a model). In this paper, we
propose a novel information theoretic perspective of the problem; we consider
the ML model as a storage channel with a capacity that increases with
overparameterization. Specifically, we consider a sender that embeds arbitrary
information in the model at training time, which can be extracted by a receiver
with black-box access to the deployed model. We derive an upper bound on the
capacity of the channel based on the number of available parameters. We then
explore black-box write and read primitives that allow the attacker to: (i)
store data in an optimized way within the model by augmenting the training data
at the transmitter side, and (ii) read it by querying the model after it is
deployed. We also analyze the detectability of the writing primitive and
consider a new version of the problem that takes information storage
covertness into account. Specifically, to obtain storage covertness, we
introduce a new constraint such that the data augmentation used for the write
primitives minimizes the distribution shift relative to the initial (baseline-task)
distribution. This constraint introduces a level of "interference" with the
initial task, thereby limiting the channel's effective capacity. Therefore, we
develop optimizations to improve the capacity in this case, including a novel
ML-specific substitution-based error correction protocol. We believe that the
proposed modeling of the problem offers new tools to better understand and
mitigate potential vulnerabilities of ML, especially in the context of
increasingly large models.
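The write/read primitives described above can be illustrated with a minimal sketch: the sender embeds message bits by augmenting the training set with out-of-distribution "trigger" inputs whose labels carry the bits, and the receiver reads them back by querying the deployed model at those trigger points. The toy 2-D task, trigger placement, and network size below are illustrative assumptions, not the paper's actual construction or encoding.

```python
import numpy as np

rng = np.random.default_rng(0)

# Baseline task: classify 2-D points by the sign of x0 + x1.
Xb = rng.normal(size=(200, 2))
yb = (Xb.sum(axis=1) > 0).astype(float)

# Write primitive (sender side): message bits become labels of trigger
# inputs placed far from the baseline distribution (illustrative choice).
bits = [1, 0, 1, 1, 0, 0, 1, 0]
angles = np.linspace(0, 2 * np.pi, len(bits), endpoint=False)
Xt = 5.0 * np.stack([np.cos(angles), np.sin(angles)], axis=1)
yt = np.array(bits, dtype=float)

# Augmented training set = baseline data + trigger data.
X = np.vstack([Xb, Xt])
y = np.concatenate([yb, yt])

# Overparameterized one-hidden-layer MLP (extra capacity stores the bits).
H = 64
W1 = rng.normal(scale=0.3, size=(2, H)); b1 = np.zeros(H)
W2 = rng.normal(scale=0.3, size=(H, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return h, p.ravel()

# Full-batch gradient descent on binary cross-entropy.
lr = 0.2
for _ in range(5000):
    h, p = forward(X)
    g = (p - y)[:, None] / len(y)          # dL/dlogit
    gW2 = h.T @ g; gb2 = g.sum(axis=0)
    gh = (g @ W2.T) * (1.0 - h ** 2)       # backprop through tanh
    gW1 = X.T @ gh; gb1 = gh.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

# Read primitive (receiver side): black-box queries at the trigger points.
_, pt = forward(Xt)
decoded = (pt > 0.5).astype(int).tolist()
print(decoded)
```

The sketch also hints at the interference/covertness trade-off the abstract describes: the trigger labels are fit alongside the baseline task, so capacity spent on the message competes with (and, if the triggers drift closer to the baseline distribution, interferes with) the original task.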